Objective:
This notebook aims to detect the presence of food in images! In addition, Grad-CAM is used to visualize which features of the image the model relies on most when classifying.
Methodology:
Five approaches were used for training and evaluation:
Results
Training accuracy | Test accuracy:
Image classification has long been studied and is an important field of machine learning and artificial intelligence. Usually associated with complex models, detecting and correctly classifying an image is not as trivial for a computer as it is for us humans. For that reason, several models exist nowadays, and deep neural networks gained ground with the advancement of micro-processors, overcoming time and memory constraints.
Despite the success of convolutional neural networks at deciphering the subtle cues in an image, simple and light classification models (e.g. SVM, linear regression) have always attracted attention for their interpretability and are still explored for modeling high-dimensional problems such as image classification.
But the problem scope goes beyond how complex a model is. For instance, Joutou et al. [2] proposed an SVM classifier for this purpose and, despite the "simplicity" of the model, the data acquired to predict spherical fruits involves laser scanning to obtain reflectance and range measurements, from which the fruit shape and color are derived.
In particular, recognizing the presence of food items in images is a challenge in its own right, and Jimenez et al. proposed one of the first methods back in 1999 [1].
Although food items are difficult to classify, as they are strongly related to color and shape [3], novel methods have been tested, and a combination of multiple CNN models can already predict Mediterranean Diet food items with an accuracy of 52.71% [4].
References:
[1] Joutou, T., Yanai, K. (2009). A food image recognition system with multiple kernel learning. In: IEEE International Conference on Image Processing, pp. 285–288.
[2] Farinella, G. M., Allegra, D., Stanco, F., & Battiato, S. (2015, September). On the exploitation of one class classification to distinguish food vs non-food images. In: International Conference on Image Analysis and Processing, pp. 375–383. Springer, Cham.
[3] Farinella, G. M., Allegra, D., & Stanco, F. (2014, September). A benchmark dataset to study the representation of food images. In: European Conference on Computer Vision, pp. 584–599. Springer, Cham.
[4] Papathanail, I., Lu, Y., Vasiloglou, M., Stathopoulou, T., Ghosh, A., Faeh, D., & Mougiakakou, S. (2021, March). Food recognition in assessing the Mediterranean diet: a hierarchical approach (unpublished). In: 14th International Conference on Advanced Technologies & Treatments for Diabetes.
%matplotlib inline
import os, pickle
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import cv2
import seaborn as sns; sns.set()
from numpy import asarray
from scipy import stats
import PIL
from PIL import Image
### sklearn libraries
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn.metrics import (accuracy_score, log_loss, classification_report,
                             confusion_matrix, ConfusionMatrixDisplay)
### tensorflow / keras libraries
import tensorflow
from tensorflow.keras import applications
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import (BatchNormalization, Conv2D, Activation,
                                     Flatten, Dropout, Dense)
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping
from tensorflow.keras.optimizers import RMSprop, Adagrad, Adam
from tensorflow.keras import backend as K
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
from google.colab import files
!cp /content/drive/MyDrive/foodtask/gradcam.py /content/
!cp /content/drive/MyDrive/foodtask/getalldata.py /content/
from gradcam import *
from getalldata import *
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
import shutil
# Remove any previous copies before re-extracting the data
shutil.rmtree("/content/food", ignore_errors=True)
shutil.rmtree("/content/food_pre", ignore_errors=True)
os.mkdir("food")
# !unzip "/content/drive/MyDrive/foodtask/Train.zip" -d /content/food/
# !unzip "/content/drive/MyDrive/foodtask/Valid.zip" -d /content/food/
# !unzip "/content/drive/MyDrive/foodtask/Test.zip" -d /content/food/Test/
Although the images used for training and testing our models all come from the same dataset (the TRAIN and TEST folders), we prepared three types of processed data in order to train the different models.
The TRAIN folder was split into training and validation datasets.
The TEST folder was used entirely for test evaluation.
Regardless of the model, due to computational limits, the images had to be resized to 1/4 of the original width and height (from 240x320 to 60x80).
Furthermore, for training the new CNN, the images also had to be converted to grayscale. Although this is still a topic of discussion in the scientific community, the accuracies obtained with RGB and grayscale inputs do not differ much [5][6] (see the image below).
data_inet: for VGG16 model. input : (samples, 60, 80, 3)
data: for own architecture. input : (samples, 60, 80, 1)
data_stack : for simple DNN. input : (samples, 4800)
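The three input formats above can be illustrated with a NumPy-only sketch (the real pipeline in getalldata.py reads and resizes the images with cv2; the random array here merely stands in for one resized image):

```python
import numpy as np

h, w = 60, 80                      # images resized to 1/4 of 240x320
rgb = np.random.randint(0, 256, (h, w, 3), dtype=np.uint8)  # stand-in for one resized image

# data_inet: 3-channel input for VGG16 -> (samples, 60, 80, 3)
x_inet = rgb[np.newaxis, ...]

# data: grayscale input for the new CNN -> (samples, 60, 80, 1)
gray = rgb.mean(axis=-1)           # cv2.cvtColor(..., COLOR_BGR2GRAY) in practice
x_cnn = gray[np.newaxis, ..., np.newaxis]

# data_stack: flattened grayscale for the simple DNN -> (samples, 4800)
x_dnn = gray.reshape(1, h * w)

x_inet.shape, x_cnn.shape, x_dnn.shape
```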
For more information about the differences training CNN with grayscale and RGB images:
[5] Convolutional neural network for human micro-Doppler classification
[6] Color-to-Grayscale: Does the Method Matter in Image Recognition?
To save time, the data were processed once beforehand (refer to the getalldata.py script) and saved for later use.
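The caching pattern used throughout the notebook (process once, pickle to Drive, reload on later runs) can be sketched as follows. This is a generic illustration: `load_or_build` is a hypothetical helper, while the notebook itself inlines one try/except per file.

```python
import os, pickle, tempfile

def load_or_build(path, build):
    """Load a pickled object from `path`; if absent, build it and cache it."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        obj = build()
        with open(path, "wb") as f:
            pickle.dump(obj, f)
        return obj

cache = os.path.join(tempfile.mkdtemp(), "data.pkl")
first = load_or_build(cache, lambda: {"Train/": [1, 2], "Valid/": [3]})  # built and cached
again = load_or_build(cache, lambda: {})                                 # loaded from cache
```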
data = {}
data_inet = {}
w = 80
h = 60
metadata = {"Train/":{0:0, 1:0}, "Valid/":{0:0, 1:0}}
right = True
# Load the cached, preprocessed datasets if they exist
try:
    with open("/content/drive/MyDrive/foodtask/data.pkl", "rb") as a_file:
        data = pickle.load(a_file)
except (FileNotFoundError, pickle.UnpicklingError):
    right = False
try:
    with open("/content/drive/MyDrive/foodtask/data_inet.pkl", "rb") as a_file:
        data_inet = pickle.load(a_file)
except (FileNotFoundError, pickle.UnpicklingError):
    right = False
try:
    with open("/content/drive/MyDrive/foodtask/data_stack.pkl", "rb") as a_file:
        data_stack = pickle.load(a_file)
except (FileNotFoundError, pickle.UnpicklingError):
    right = False
# Rebuild everything from the raw images if any cache was missing
if not right:
    data, data_inet, data_stack = getall()
# get y_valid and y_train
y = True
try:
    with open("/content/drive/MyDrive/foodtask/y_train.pkl", "rb") as a_file:
        y_train = pickle.load(a_file)
except (FileNotFoundError, pickle.UnpicklingError):
    y = False
try:
    with open("/content/drive/MyDrive/foodtask/y_valid.pkl", "rb") as a_file:
        y_valid = pickle.load(a_file)
except (FileNotFoundError, pickle.UnpicklingError):
    y = False
if not y:
    # Labels: class 0 first, then class 1, matching the image ordering
    y_train = np.concatenate((np.zeros((metadata["Train/"][0], 1)), np.ones((metadata["Train/"][1], 1))))
    y_valid = np.concatenate((np.zeros((metadata["Valid/"][0], 1)), np.ones((metadata["Valid/"][1], 1))))
y_train.shape, y_valid.shape
((13113, 1), (3279, 1))
data_inet["Train/"] = np.array(data_inet["Train/"])
data_inet["Train/"] = preprocess_input(data_inet["Train/"])
data_inet["Valid/"] = np.array(data_inet["Valid/"])
data_inet["Valid/"] = preprocess_input(data_inet["Valid/"])
The code for the Grad-CAM used can be found in the script gradcam.py.
References:
https://medium.com/@daniel.reiff2/understand-your-algorithm-with-grad-cam-d3b62fce353
https://medium.com/@stepanulyanin/implementing-grad-cam-in-pytorch-ea0937c31e82
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, Dhruv Batra https://arxiv.org/abs/1610.02391
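The core computation of Grad-CAM can be sketched in NumPy. This is a minimal, framework-free illustration only: `activations` and `gradients` are assumed to have already been produced by the backbone network, and the notebook's actual implementation lives in gradcam.py.

```python
import numpy as np

def grad_cam(activations, gradients):
    """activations: (H, W, K) feature maps of the last conv layer.
    gradients: (H, W, K) gradients of the class score w.r.t. those maps.
    Returns an (H, W) heatmap normalized to [0, 1]."""
    # Global-average-pool the gradients -> one importance weight per channel
    weights = gradients.mean(axis=(0, 1))                        # shape (K,)
    # Weighted sum of the feature maps, then ReLU to keep positive evidence
    cam = np.maximum((activations * weights).sum(axis=-1), 0.0)  # shape (H, W)
    # Normalize for display
    if cam.max() > 0:
        cam /= cam.max()
    return cam

rng = np.random.default_rng(0)
acts  = rng.random((7, 7, 128))          # stand-in conv activations
grads = rng.standard_normal((7, 7, 128)) # stand-in gradients
heatmap = grad_cam(acts, grads)          # (7, 7); upsampled to image size for overlay
```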
Function to make single predictions while visualizing the image.
Two cases were taken into account, depending on the model predicting:
while the new CNN model can predict directly from the image input, the pre-trained models must first pass the image through the convolutional base before predicting.
import random
test_dir = "/content/food/Test/evaluation"
val_dir = "/content/food/Valid/"
def make_prediction(classifier, img_path, conv, image_re):
    classes_x, predict = None, None
    if conv:
        # Pre-trained case: extract features with the convolutional base first
        image = cv2.imread(img_path)
        img = cv2.resize(image, (w, h))
        img_tensor = preprocess_input(np.copy(img))
        if img_tensor.shape[-1] == 3:
            features = conv.predict(img_tensor.reshape(1, h, w, 3))
        else:
            features = conv.predict(img_tensor.reshape(1, h, w, 1))
        if "SGDC" in str(type(classifier)):
            features = features.reshape((-1, 1024))
        # Make prediction
        try:
            prediction = classifier.predict(features)
        except:
            prediction = classifier.predict(features.reshape(1, 60*80*128))
        img_tensor /= 255.
    else:
        # New CNN case: predict directly from the grayscale image
        img = cv2.imread(img_path)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        img = cv2.resize(img, (80, 60))
        gs_image = img.reshape((1, h, w, 1))
        img_tensor = asarray(gs_image)  # /255.0
        try:
            prediction = classifier.predict(img_tensor)
            classes_x = np.argmax(prediction, axis=-1)
        except:
            prediction = classifier.predict(img_tensor.reshape(1, 60*80*128))
            classes_x = np.argmax(prediction, axis=-1)
        predict = classifier.predict(img_tensor, steps=None, callbacks=None,
                                     max_queue_size=10, workers=1,
                                     use_multiprocessing=False, verbose=0)
    return prediction, classes_x, predict, img_tensor
While CNNs are now the standard for image classification, DNNs were the basis for the first models that learned to recognize images. For the sake of curiosity, a simple (really simple) DNN is built here.
It is composed of nothing more than dense layers!
Parameters
batch size = 32
rlrop patience = 50
optimizer = Adam()
epochs = 500
Tuning the parameters
In the first attempts, the training accuracy increased up to 0.8, then fell suddenly to a steady value of 0.5 until the end of training.
Taking into account the problems of vanishing gradients and local optima, a possible remedy for such behavior is a variable learning rate that reacts when the accuracy stays unchanged over a certain period of training.
For that reason, a ReduceLROnPlateau callback was used to lower the learning rate whenever learning reaches a plateau.
This solved the problem, with the learning rate reduced by a factor of 0.001 after every 50 epochs showing no accuracy improvement.
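The plateau-based schedule described above can be sketched in plain Python. This is an illustrative re-implementation of the idea only, not Keras's ReduceLROnPlateau itself; the `factor` and `patience` names mirror the callback's parameters.

```python
def plateau_lr(metric_history, lr0=1e-3, factor=0.1, patience=50):
    """Return the learning rate used at each epoch, reducing it by `factor`
    whenever the monitored metric has not improved for `patience` epochs."""
    lr, best, wait, lrs = lr0, float("inf"), 0, []
    for val_loss in metric_history:
        if val_loss < best:          # improvement: reset the patience counter
            best, wait = val_loss, 0
        else:
            wait += 1
            if wait >= patience:     # plateau reached: decay the learning rate
                lr *= factor
                wait = 0
        lrs.append(lr)
    return lrs

# Three improving epochs, then a plateau, with patience=2 for brevity
schedule = plateau_lr([1.0, 0.8, 0.6, 0.6, 0.6, 0.6, 0.6], lr0=0.01, patience=2)
```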
# # UNZIP AND GET MODELs
!unzip "/content/drive/MyDrive/foodtask/DNN1st.zip" -d /content/model
Archive:  /content/drive/MyDrive/foodtask/DNN1st.zip
   creating: /content/model/content/food/firsttry/
   creating: /content/model/content/food/firsttry/variables/
  inflating: /content/model/content/food/firsttry/variables/variables.data-00000-of-00001
  inflating: /content/model/content/food/firsttry/variables/variables.index
   creating: /content/model/content/food/firsttry/assets/
  inflating: /content/model/content/food/firsttry/keras_metadata.pb
  inflating: /content/model/content/food/firsttry/saved_model.pb
model = tensorflow.keras.models.load_model("/content/model/content/food/firsttry")
model.summary()
Model: "sequential_8"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_43 (Dense) (None, 32) 153632
dense_44 (Dense) (None, 30) 990
dense_45 (Dense) (None, 64) 1984
dense_46 (Dense) (None, 64) 4160
dense_47 (Dense) (None, 64) 4160
dense_48 (Dense) (None, 1) 65
=================================================================
Total params: 164,991
Trainable params: 164,991
Non-trainable params: 0
_________________________________________________________________
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping
from tensorflow.keras.optimizers import RMSprop, Adagrad, Adam, SGD
"""
Getting xtrain and ytrain from data_stack
"""
xtrain, xvalid = data_stack["Train/"][:,:-1], data_stack["Valid/"][:,:-1]
ytrain, yvalid = data_stack["Train/"][:,-1], data_stack["Valid/"][:,-1]
def getmodelDNN(h, w):
    model = Sequential()
    model.add(Dense(32, input_dim=h*w, activation='relu'))
    model.add(Dense(30, activation='relu'))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(128, activation='relu'))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # compile the keras model
    opt = Adam()
    model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
    return model
def trainmodelDNN(model, data):
    xtrain, ytrain = data[0], data[1]
    xvalid, yvalid = data[2], data[3]
    # fit, reducing the learning rate when val_loss plateaus
    rlrop = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=50)
    history = model.fit(xtrain, ytrain, validation_split=0.2, epochs=500,
                        batch_size=256, callbacks=[rlrop])
    model.save('/content/food/second')
    # evaluate
    _, accuracy = model.evaluate(xtrain, ytrain)
    plt.plot(history.history['accuracy'])
    plt.plot(history.history['val_accuracy'])
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'val'], loc='upper left')
    plt.show()
    print('Train accuracy: %.2f' % (accuracy*100))
    _, acc = model.evaluate(xvalid, yvalid)
    print('Validation accuracy: %.2f' % (acc*100))
    return history, accuracy
# Data = [xtrain,ytrain, xvalid, yvalid]
# model = getmodelDNN(h,w)
# hist, acc = trainmodelDNN(model, Data)
bag_metadata = {}
xvalid,yvalid = data_stack["Valid/"][:,:-1],data_stack["Valid/"][:,-1]
# _, acc = model.evaluate(xvalid,yvalid)
y_pred = model.predict(xvalid)
y_pred = np.where(y_pred < 0.5, 0, 1)
cm = confusion_matrix(yvalid, y_pred)
dispDNN = ConfusionMatrixDisplay(confusion_matrix=cm,
display_labels=[0,1])
dispDNN.plot()
plt.grid(False)
plt.show()
cl_reportDNN = classification_report(yvalid,y_pred)
print(cl_reportDNN)
precision recall f1-score support
0.0 0.65 0.76 0.70 1601
1.0 0.73 0.62 0.67 1678
accuracy 0.69 3279
macro avg 0.69 0.69 0.68 3279
weighted avg 0.69 0.69 0.68 3279
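As a reminder of how the report's numbers follow from the confusion matrix, here is a NumPy sketch; the `cm` values are illustrative, chosen only to approximate the DNN results above.

```python
import numpy as np

# cm[i, j] = samples of true class i predicted as class j (illustrative values)
cm = np.array([[1216,  385],
               [ 638, 1040]])

tp = np.diag(cm).astype(float)
precision = tp / cm.sum(axis=0)   # per predicted class (column sums)
recall    = tp / cm.sum(axis=1)   # per true class (row sums)
f1 = 2 * precision * recall / (precision + recall)
accuracy = tp.sum() / cm.sum()
```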
bag_metadata["DNN"] = {"model": model, "conv": None, "metrics":[cm,cl_reportDNN], "data":(data_stack["Valid/"],y_valid)}
Transfer Learning
"Standing on the shoulders of giants"
Although all pre-trained image-classification models should achieve good accuracy on most classification tasks, VGG16 was preferred because its training dataset (ImageNet) contains food images [7].
References:
For this model, we use the data in the 3-channel format (60, 80, 3), as VGG16 was trained on 3-channel images.
# !unzip "/content/drive/MyDrive/foodtask/trained_vgg16.zip" -d /content/model
# model = tensorflow.keras.models.load_model("/content/trained_vgg16 (1).h5")
model = tensorflow.keras.models.load_model("/content/drive/MyDrive/foodtask/vgg16/trained_vgg16.h5")
model.summary()
Model: "sequential_9"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten_9 (Flatten) (None, 1024) 0
dense_10 (Dense) (None, 256) 262400
dropout_5 (Dropout) (None, 256) 0
dense_11 (Dense) (None, 1) 257
=================================================================
Total params: 262,657
Trainable params: 262,657
Non-trainable params: 0
_________________________________________________________________
# PRE TRAINED MODEL VGG16
conv = tensorflow.keras.applications.VGG16(include_top=False,weights='imagenet')
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
58892288/58889256 [==============================] - 0s 0us/step
58900480/58889256 [==============================] - 0s 0us/step
conv.summary()
Model: "vgg16"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, None, None, 3)] 0
block1_conv1 (Conv2D) (None, None, None, 64) 1792
block1_conv2 (Conv2D) (None, None, None, 64) 36928
block1_pool (MaxPooling2D) (None, None, None, 64) 0
block2_conv1 (Conv2D) (None, None, None, 128) 73856
block2_conv2 (Conv2D) (None, None, None, 128) 147584
block2_pool (MaxPooling2D) (None, None, None, 128) 0
block3_conv1 (Conv2D) (None, None, None, 256) 295168
block3_conv2 (Conv2D) (None, None, None, 256) 590080
block3_conv3 (Conv2D) (None, None, None, 256) 590080
block3_pool (MaxPooling2D) (None, None, None, 256) 0
block4_conv1 (Conv2D) (None, None, None, 512) 1180160
block4_conv2 (Conv2D) (None, None, None, 512) 2359808
block4_conv3 (Conv2D) (None, None, None, 512) 2359808
block4_pool (MaxPooling2D) (None, None, None, 512) 0
block5_conv1 (Conv2D) (None, None, None, 512) 2359808
block5_conv2 (Conv2D) (None, None, None, 512) 2359808
block5_conv3 (Conv2D) (None, None, None, 512) 2359808
block5_pool (MaxPooling2D) (None, None, None, 512) 0
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________
Extract the features for both TRAINING and VALIDATION
features_dataTrain = np.load('/content/drive/MyDrive/foodtask/vgg16/featuresTrain_vgg16SVM.npy')
features_dataVal = np.load('/content/drive/MyDrive/foodtask/vgg16/featuresValid_vgg16SVM.npy')
features_dataTrain.shape, features_dataVal.shape
((13113, 1024), (3279, 1024))
# Reference: https://github.com/SaideshwarKotha/Food-Image-Classifier/blob/master/FoodImage_Classification.ipynb
if len(features_dataTrain) == 0:
features_dataTrain, features_dataVal = features_vgg(data_inet)
n_cl0tr, n_cl1tr = 6404, 6709
n_cl0val, n_cl1val = 1601, 1678
y_trainvgg = np.concatenate( (np.ones((n_cl1tr,1)),np.zeros((n_cl0tr,1))) )
y_validvgg = np.concatenate( (np.ones((n_cl1val,1)),np.zeros((n_cl0val,1))) )
"""
Again, the validation set comes from the TRAIN folder provided, so to have the entire
TEST folder for testing
"""
features_dataTrain = features_dataTrain.reshape((-1,1,2,512))
features_dataVal = features_dataVal.reshape((-1,1,2,512))
xtrain, xvalid, ytrain, yvalid = train_test_split(features_dataTrain, y_trainvgg, stratify=y_trainvgg, test_size=0.2)
print('Number of data points in train data:', xtrain.shape[0])
print('Number of data points in cross validation data:', xvalid.shape[0])
# print('Number of data points in test data:', X_test.shape[0])
Number of data points in train data: 10490
Number of data points in cross validation data: 2623
!unzip "/content/drive/MyDrive/foodtask/vgg16/vgg16.zip" -d /content/vgg16final
model = tensorflow.keras.models.load_model("/content/vgg16final/vgg16")
Archive:  /content/drive/MyDrive/foodtask/vgg16/vgg16.zip
   creating: /content/vgg16final/vgg16/
   creating: /content/vgg16final/vgg16/assets/
  inflating: /content/vgg16final/vgg16/keras_metadata.pb
  inflating: /content/vgg16final/vgg16/saved_model.pb
   creating: /content/vgg16final/vgg16/variables/
  inflating: /content/vgg16final/vgg16/variables/variables.data-00000-of-00001
  inflating: /content/vgg16final/vgg16/variables/variables.index
# Reference: https://github.com/pmarcelino/blog/blob/master/dogs_cats/dogs_cats.ipynb
epochs = 50

def getmodelfully():
    modelf = Sequential()
    # model.add(Flatten(input_shape=(7,7,512)))
    modelf.add(Flatten(input_shape=(1, 2, 512)))
    modelf.add(Dense(256, activation='relu', input_dim=1024))
    modelf.add(Dropout(0.5))
    modelf.add(Dense(1, activation='sigmoid'))
    modelf.summary()
    modelf.compile(optimizer=Adam(),
                   loss='binary_crossentropy',
                   metrics=['acc'])
    return modelf

def trainmodelfully(modelf, data, epochs):
    xtrain, ytrain = data[0], data[1]
    xvalid, yvalid = data[2], data[3]
    batch_size = 32
    history = modelf.fit(xtrain, ytrain,
                         epochs=epochs,
                         batch_size=batch_size,
                         validation_data=(xvalid, yvalid))
    modelf.save("trained_vgg16final")
    return history
model2 = getmodelfully()
hist2 = trainmodelfully(model2,[xtrain,ytrain,xvalid,yvalid],epochs)
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten (Flatten) (None, 1024) 0
dense (Dense) (None, 256) 262400
dropout (Dropout) (None, 256) 0
dense_1 (Dense) (None, 1) 257
=================================================================
Total params: 262,657
Trainable params: 262,657
Non-trainable params: 0
_________________________________________________________________
Epoch 1/50
328/328 [==============================] - 2s 5ms/step - loss: 1.2587 - acc: 0.8730 - val_loss: 0.3271 - val_acc: 0.9100
Epoch 2/50
328/328 [==============================] - 2s 5ms/step - loss: 0.2379 - acc: 0.9219 - val_loss: 0.2566 - val_acc: 0.9211
Epoch 3/50
328/328 [==============================] - 2s 5ms/step - loss: 0.1474 - acc: 0.9484 - val_loss: 0.2510 - val_acc: 0.9260
Epoch 4/50
328/328 [==============================] - 2s 5ms/step - loss: 0.1086 - acc: 0.9611 - val_loss: 0.2531 - val_acc: 0.9325
Epoch 5/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0846 - acc: 0.9702 - val_loss: 0.2765 - val_acc: 0.9264
Epoch 6/50
328/328 [==============================] - 1s 5ms/step - loss: 0.0685 - acc: 0.9743 - val_loss: 0.3053 - val_acc: 0.9253
Epoch 7/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0656 - acc: 0.9763 - val_loss: 0.3097 - val_acc: 0.9268
Epoch 8/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0602 - acc: 0.9775 - val_loss: 0.3553 - val_acc: 0.9295
Epoch 9/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0560 - acc: 0.9801 - val_loss: 0.3477 - val_acc: 0.9333
Epoch 10/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0549 - acc: 0.9827 - val_loss: 0.3619 - val_acc: 0.9260
Epoch 11/50
328/328 [==============================] - 1s 5ms/step - loss: 0.0553 - acc: 0.9802 - val_loss: 0.3689 - val_acc: 0.9283
Epoch 12/50
328/328 [==============================] - 1s 5ms/step - loss: 0.0610 - acc: 0.9817 - val_loss: 0.3775 - val_acc: 0.9264
Epoch 13/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0614 - acc: 0.9797 - val_loss: 0.3491 - val_acc: 0.9333
Epoch 14/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0571 - acc: 0.9818 - val_loss: 0.3481 - val_acc: 0.9318
Epoch 15/50
328/328 [==============================] - 1s 5ms/step - loss: 0.0477 - acc: 0.9844 - val_loss: 0.3678 - val_acc: 0.9333
Epoch 16/50
328/328 [==============================] - 1s 4ms/step - loss: 0.0428 - acc: 0.9853 - val_loss: 0.3682 - val_acc: 0.9279
Epoch 17/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0585 - acc: 0.9827 - val_loss: 0.3476 - val_acc: 0.9329
Epoch 18/50
328/328 [==============================] - 1s 4ms/step - loss: 0.0440 - acc: 0.9862 - val_loss: 0.3794 - val_acc: 0.9318
Epoch 19/50
328/328 [==============================] - 1s 5ms/step - loss: 0.0419 - acc: 0.9865 - val_loss: 0.3775 - val_acc: 0.9276
Epoch 20/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0318 - acc: 0.9896 - val_loss: 0.4172 - val_acc: 0.9302
Epoch 21/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0415 - acc: 0.9894 - val_loss: 0.4126 - val_acc: 0.9283
Epoch 22/50
328/328 [==============================] - 1s 5ms/step - loss: 0.0348 - acc: 0.9896 - val_loss: 0.4705 - val_acc: 0.9306
Epoch 23/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0419 - acc: 0.9884 - val_loss: 0.4621 - val_acc: 0.9318
Epoch 24/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0430 - acc: 0.9875 - val_loss: 0.4538 - val_acc: 0.9276
Epoch 25/50
328/328 [==============================] - 1s 4ms/step - loss: 0.0286 - acc: 0.9908 - val_loss: 0.5120 - val_acc: 0.9306
Epoch 26/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0419 - acc: 0.9894 - val_loss: 0.5250 - val_acc: 0.9333
Epoch 27/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0283 - acc: 0.9910 - val_loss: 0.5394 - val_acc: 0.9318
Epoch 28/50
328/328 [==============================] - 1s 5ms/step - loss: 0.0364 - acc: 0.9884 - val_loss: 0.5533 - val_acc: 0.9287
Epoch 29/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0272 - acc: 0.9922 - val_loss: 0.5360 - val_acc: 0.9302
Epoch 30/50
328/328 [==============================] - 1s 5ms/step - loss: 0.0295 - acc: 0.9917 - val_loss: 0.5271 - val_acc: 0.9321
Epoch 31/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0268 - acc: 0.9918 - val_loss: 0.6200 - val_acc: 0.9291
Epoch 32/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0342 - acc: 0.9890 - val_loss: 0.6493 - val_acc: 0.9268
Epoch 33/50
328/328 [==============================] - 1s 5ms/step - loss: 0.0345 - acc: 0.9914 - val_loss: 0.6208 - val_acc: 0.9291
Epoch 34/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0303 - acc: 0.9897 - val_loss: 0.5855 - val_acc: 0.9257
Epoch 35/50
328/328 [==============================] - 1s 5ms/step - loss: 0.0321 - acc: 0.9906 - val_loss: 0.5971 - val_acc: 0.9279
Epoch 36/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0305 - acc: 0.9919 - val_loss: 0.6540 - val_acc: 0.9226
Epoch 37/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0270 - acc: 0.9934 - val_loss: 0.6505 - val_acc: 0.9295
Epoch 38/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0277 - acc: 0.9915 - val_loss: 0.6206 - val_acc: 0.9276
Epoch 39/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0332 - acc: 0.9913 - val_loss: 0.6231 - val_acc: 0.9299
Epoch 40/50
328/328 [==============================] - 1s 4ms/step - loss: 0.0246 - acc: 0.9929 - val_loss: 0.6721 - val_acc: 0.9291
Epoch 41/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0201 - acc: 0.9942 - val_loss: 0.7000 - val_acc: 0.9283
Epoch 42/50
328/328 [==============================] - 1s 5ms/step - loss: 0.0262 - acc: 0.9920 - val_loss: 0.7166 - val_acc: 0.9260
Epoch 43/50
328/328 [==============================] - 1s 5ms/step - loss: 0.0258 - acc: 0.9923 - val_loss: 0.7225 - val_acc: 0.9260
Epoch 44/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0315 - acc: 0.9921 - val_loss: 0.6669 - val_acc: 0.9268
Epoch 45/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0266 - acc: 0.9919 - val_loss: 0.7103 - val_acc: 0.9291
Epoch 46/50
328/328 [==============================] - 2s 7ms/step - loss: 0.0231 - acc: 0.9941 - val_loss: 0.7682 - val_acc: 0.9295
Epoch 47/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0179 - acc: 0.9952 - val_loss: 0.8028 - val_acc: 0.9337
Epoch 48/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0228 - acc: 0.9936 - val_loss: 0.7309 - val_acc: 0.9295
Epoch 49/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0210 - acc: 0.9944 - val_loss: 0.8201 - val_acc: 0.9283
Epoch 50/50
328/328 [==============================] - 2s 5ms/step - loss: 0.0388 - acc: 0.9919 - val_loss: 0.9046 - val_acc: 0.9253
INFO:tensorflow:Assets written to: trained_vgg16final/assets
val_loss, val_acc = model.evaluate(features_dataVal,y_validvgg)
loss, acc = model.evaluate(xtrain,ytrain)
103/103 [==============================] - 0s 2ms/step - loss: 0.8702 - acc: 0.9076
328/328 [==============================] - 1s 2ms/step - loss: 0.1302 - acc: 0.9842
acc = hist2.history['acc']
val_acc = hist2.history['val_acc']
loss = hist2.history['loss']
val_loss = hist2.history['val_loss']
epochs = range(1, len(acc)+1)
plt.plot(epochs, acc, 'bo', label='Training accuracy')
plt.plot(epochs, val_acc, 'b', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
x_valid = features_dataVal
y_pred = model.predict(x_valid)
y_pred = np.where(y_pred < 0.5, 0, 1)
cm_fully = confusion_matrix(y_validvgg, y_pred)
dispfully = ConfusionMatrixDisplay(confusion_matrix=cm_fully,
display_labels=[0,1])
dispfully.plot()
plt.grid(False)
plt.show()
cl_reportfully = classification_report(y_validvgg,y_pred)
print(cl_reportfully)
precision recall f1-score support
0.0 0.90 0.91 0.91 1601
1.0 0.91 0.91 0.91 1678
accuracy 0.91 3279
macro avg 0.91 0.91 0.91 3279
weighted avg 0.91 0.91 0.91 3279
bag_metadata["vggfully"] = {"model": model, "conv": conv, "metrics":[cm_fully,cl_reportfully], "data":(features_dataVal,y_validvgg)}
Visualize predictions
import random
test_dir = "/content/drive/MyDrive/foodtask/ImagesTest/"
def visualize_predictions(classifier, n_cases, conv):
    print(f'Classifier : {classifier}, and CONV : {conv}')
    for i in range(0, n_cases):
        # First three test images are class 0 (not food), the rest class 1 (food)
        classe = (0 if i < 3 else 1)
        img_path = test_dir + str(classe) + "/" + str(i) + ".jpg"
        prediction, _, _, img_tensor = make_prediction(classifier, img_path, conv, None)
        print("prediction : ", prediction)
        # img_tensor /= 255
        # Show picture
        plt.figure()
        plt.grid(False)
        plt.imshow(img_tensor)
        plt.show()
        if prediction < 0.5:
            print('Not Food')
        else:
            print('Food!')
visualize_predictions(model, 6,conv)
Classifier : <keras.engine.sequential.Sequential object at 0x7f2044fdae50>, and CONV : <keras.engine.functional.Functional object at 0x7f204508fa10>
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
prediction : [[6.6945836e-07]]
Not Food
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
prediction : [[2.880809e-15]]
Not Food
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
prediction : [[1.486061e-05]]
Not Food
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
prediction : [[1.]]
Food!
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
prediction : [[0.9999908]]
Food!
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
prediction : [[0.9999994]]
Food!
model = tensorflow.keras.models.load_model("/content/drive/MyDrive/foodtask/vgg16/trained_vgg16.h5")
model.summary()
from sklearn.calibration import CalibratedClassifierCV
from sklearn.linear_model import SGDClassifier

data_train = features_dataTrain.reshape((-1, 1024))
data_valid = features_dataVal.reshape((-1, 1024))
# Linear classifier (hinge loss -> linear SVM) with SGD training
clf = SGDClassifier(alpha=0.001, penalty='l2', loss='hinge', random_state=42)
clf.fit(data_train, y_trainvgg.ravel())  # ravel() avoids the column-vector warning
y_pred = clf.predict(data_valid)
cm_sgd = confusion_matrix(y_validvgg, y_pred)
dispfully = ConfusionMatrixDisplay(confusion_matrix=cm_sgd,
display_labels=[0,1])
dispfully.plot()
plt.grid(False)
plt.show()
cl_reportsgd = classification_report(y_validvgg,y_pred)
print(cl_reportsgd)
precision recall f1-score support
0.0 0.88 0.89 0.89 1601
1.0 0.90 0.89 0.89 1678
accuracy 0.89 3279
macro avg 0.89 0.89 0.89 3279
weighted avg 0.89 0.89 0.89 3279
bag_metadata["vggsvm"] = {"model": clf, "conv": conv, "metrics":[cm_sgd,cl_reportsgd], "data":(features_dataVal,y_validvgg)}
# Probability calibration with sigmoid (Platt) scaling
sig_clf = CalibratedClassifierCV(clf, method="sigmoid")
sig_clf.fit(data_train, y_trainvgg.ravel())  # ravel() avoids the column-vector warning
y_pred = sig_clf.predict(data_valid)
cm_sig = confusion_matrix(y_validvgg, y_pred)
dispfully = ConfusionMatrixDisplay(confusion_matrix=cm_sig,
display_labels=[0,1])
dispfully.plot()
plt.grid(False)
plt.show()
cl_reportsig = classification_report(y_validvgg,y_pred)
print(cl_reportsig)
              precision    recall  f1-score   support

         0.0       0.89      0.89      0.89      1601
         1.0       0.90      0.90      0.90      1678

    accuracy                           0.89      3279
   macro avg       0.89      0.89      0.89      3279
weighted avg       0.89      0.89      0.89      3279
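Why wrap the SGD classifier at all? With the hinge loss, SGDClassifier only outputs decision margins, and CalibratedClassifierCV (Platt scaling with method="sigmoid") adds calibrated probabilities on top. A minimal self-contained sketch on synthetic data (not the notebook's VGG features):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.calibration import CalibratedClassifierCV

# Synthetic stand-in for the VGG feature vectors (illustration only).
X, y = make_classification(n_samples=500, n_features=20, random_state=42)

raw = SGDClassifier(loss="hinge", alpha=0.001, random_state=42).fit(X, y)
# A hinge-loss SGD classifier exposes margins, not probabilities:
assert not hasattr(raw, "predict_proba")

# Wrapping it in CalibratedClassifierCV (sigmoid / Platt scaling) adds predict_proba.
calibrated = CalibratedClassifierCV(
    SGDClassifier(loss="hinge", alpha=0.001, random_state=42),
    method="sigmoid",
).fit(X, y)
proba = calibrated.predict_proba(X)
print(proba.shape)  # (500, 2); each row sums to 1
```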
# bag_metadata["vggsvm"] = {"model": sig_clf, "conv": conv, "metrics":[cm_sig,cl_reportsig], "data":(features_dataVal,y_validvgg)}
Visualize predictions
# visualize_predictions(clf, 6,conv)
# conv.summary()
Model: "vgg16"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, None, None, 3)] 0
block1_conv1 (Conv2D) (None, None, None, 64) 1792
block1_conv2 (Conv2D) (None, None, None, 64) 36928
block1_pool (MaxPooling2D) (None, None, None, 64) 0
block2_conv1 (Conv2D) (None, None, None, 128) 73856
block2_conv2 (Conv2D) (None, None, None, 128) 147584
block2_pool (MaxPooling2D) (None, None, None, 128) 0
block3_conv1 (Conv2D) (None, None, None, 256) 295168
block3_conv2 (Conv2D) (None, None, None, 256) 590080
block3_conv3 (Conv2D) (None, None, None, 256) 590080
block3_pool (MaxPooling2D) (None, None, None, 256) 0
block4_conv1 (Conv2D) (None, None, None, 512) 1180160
block4_conv2 (Conv2D) (None, None, None, 512) 2359808
block4_conv3 (Conv2D) (None, None, None, 512) 2359808
block4_pool (MaxPooling2D) (None, None, None, 512) 0
block5_conv1 (Conv2D) (None, None, None, 512) 2359808
block5_conv2 (Conv2D) (None, None, None, 512) 2359808
block5_conv3 (Conv2D) (None, None, None, 512) 2359808
block5_pool (MaxPooling2D) (None, None, None, 512) 0
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________
from gradcam import GradCam, superimpose, sigmoid, get_img_array
model = conv
conv2D_layers = [layer.name for layer in reversed(model.layers) if len(layer.output_shape) == 4 and isinstance(layer, tensorflow.keras.layers.Conv2D)]
activation_layers = [layer.name for layer in reversed(model.layers) if len(layer.output_shape) == 4 and layer.__class__.__name__ == 'ReLU']
all_layers = [layer.name for layer in reversed(model.layers) if len(layer.output_shape) == 4 and (layer.__class__.__name__ == 'ReLU' or isinstance(layer, tensorflow.keras.layers.Conv2D))]
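For reference, the core Grad-CAM weighting that the imported gradcam helper applies can be sketched in plain NumPy (synthetic activations and gradients with hypothetical shapes; this is not the module's actual code):

```python
import numpy as np

# Hedged NumPy sketch of Grad-CAM's weighting step, on synthetic data.
rng = np.random.default_rng(0)
activations = rng.random((7, 10, 512))          # H x W x channels feature map
gradients = rng.standard_normal((7, 10, 512))   # d(class score)/d(activations)

weights = gradients.mean(axis=(0, 1))           # global-average-pool the gradients
cam = np.maximum(activations @ weights, 0)      # channel-weighted sum, then ReLU
cam /= cam.max()                                # normalize to [0, 1] for display
print(cam.shape)                                # coarse heat-map, later resized over the image
```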
for i in range(3):
    img_path = "/content/drive/MyDrive/foodtask/ImagesTest/1/" + str(i) + ".jpg"
    img_size = (60,80)
    last_conv_layer_name = "block5_conv3"
    # last_conv_layer_name = "batch_normalization_16"
    # last_conv_layer_name = "conv2d_17"
    ig = cv2.imread(img_path)
    ig = cv2.resize(ig, (80,60))
    img_array = asarray(ig)
    img_array = preprocess_input(img_array)
    img = np.expand_dims(img_array, 0)
    grad_cam = GradCam(conv, img, last_conv_layer_name)
    grad_cam_superimposed = superimpose(img_array.reshape((60,80,3)), grad_cam, 0.5, emphasize=True)
    if i == 1:
        # keep this heat-map for the side-by-side comparison later on
        grad_cam_superimposed_vgg = grad_cam_superimposed
    prediction, classe_x, pred, _ = make_prediction(clf, img_path, conv, None)
    img = cv2.imread(img_path)
    plt.figure(figsize=(15, 15))
    ax = plt.subplot(1, 2, 1)
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    plt.axis('off')
    plt.title('Original Image')
    ax = plt.subplot(1, 2, 2)
    plt.imshow(grad_cam_superimposed)
    plt.axis('off')
    plt.title(f'{last_conv_layer_name} Grad-CAM heat-map')
    plt.tight_layout()
    print("prediction : ", ("Not food" if prediction < 0.5 else "Food"), prediction)
prediction : Food [1.] prediction : Food [1.] prediction : Not food [0.]
Grad-CAM for all conv layers in the model
img_path = "/content/drive/MyDrive/foodtask/ImagesTest/1/1.jpg"
img_size = (60,80)
last_conv_layer_name = "block5_conv3"
ig = cv2.imread(img_path)
ig = cv2.resize(ig, (80,60))
img_array = asarray(ig)
img_array = preprocess_input(img_array)
img = np.expand_dims(img_array,0)
plt.figure(figsize=(12, 20))
for i, layer in enumerate(conv2D_layers):
    grad_cam = GradCam(model, img, layer)
    grad_cam_emphasized = superimpose(img_array.reshape((60,80,3)), grad_cam, 0.5, emphasize=True)
    ax = plt.subplot(13, 4, i + 1)
    plt.imshow(grad_cam_emphasized)
    plt.title(layer)
    plt.axis("off")
plt.tight_layout()
datatr = data["Train/"]
datate = data["Valid/"]
(trainX, valX, trainY, valY) = train_test_split(np.array(datatr), y_train,
test_size=0.25, random_state=42)
trainX = trainX.reshape((len(trainX), h,w, 1))
valX = valX.reshape((len(valX), h,w, 1))
print(type(trainX), type(trainY))
<class 'numpy.ndarray'> <class 'numpy.ndarray'>
# UNZIP AND GET MODELs
!unzip "/content/drive/MyDrive/foodtask/OwnArchCNN.zip" -d /content/model
# model = tensorflow.keras.models.load_model("/content/trained_vgg16 (1).h5")
Archive: /content/drive/MyDrive/foodtask/OwnArchCNN.zip creating: /content/model/Model1/ creating: /content/model/Model1/assets/ inflating: /content/model/Model1/keras_metadata.pb inflating: /content/model/Model1/saved_model.pb creating: /content/model/Model1/variables/ inflating: /content/model/Model1/variables/variables.data-00000-of-00001 inflating: /content/model/Model1/variables/variables.index creating: /content/model/Valid/ creating: /content/model/Valid/0/ inflating: /content/model/Valid/0/0.jpg inflating: /content/model/Valid/0/1.jpg inflating: /content/model/Valid/0/2.jpg inflating: /content/model/Valid/0/3.jpg inflating: /content/model/Valid/0/4.jpg inflating: /content/model/Valid/0/5.jpg creating: /content/model/Valid/1/ inflating: /content/model/Valid/1/0.jpg inflating: /content/model/Valid/1/1.jpg inflating: /content/model/Valid/1/2.jpg inflating: /content/model/Valid/1/3.jpg inflating: /content/model/Valid/1/4.jpg inflating: /content/model/Valid/1/5.jpg
model = tensorflow.keras.models.load_model("/content/model/Model1")
# aug = ImageDataGenerator(rotation_range=20, zoom_range=0.15,
# width_shift_range=0.2, height_shift_range=0.2, shear_range=0.15,
# horizontal_flip=True, fill_mode="nearest")
def getmodel(h, w):
    model = Sequential()
    model.add(Conv2D(16, (3,3), padding="same", activation="relu", input_shape=(h,w,1)))
    model.add(Conv2D(16, (3,3), padding="same", activation="relu"))
    model.add(BatchNormalization(axis=-1))
    model.add(Conv2D(32, (3,3), padding="same", activation="relu"))
    model.add(BatchNormalization(axis=-1))
    model.add(Conv2D(32, (3,3), padding="same", activation="relu"))
    model.add(BatchNormalization(axis=-1))
    model.add(Conv2D(128, (3,3), padding="same", activation="relu"))
    model.add(BatchNormalization(axis=-1))
    model.add(Conv2D(128, (3,3), padding="same", activation="relu"))
    model.add(BatchNormalization(axis=-1))
    model.add(Flatten())
    model.add(Dense(32))
    model.add(BatchNormalization())
    model.add(Dense(1, activation='sigmoid'))
    return model
def trainmodel(model, alldata, aug, epochs):
    trainX, trainY = alldata[0], alldata[1]
    valX, valY = alldata[2], alldata[3]
    # xtest, ytest = alldata[4], alldata[5]
    opt = Adam()
    model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
    checkpoint_path = "content/food/cp.ckpt"
    checkpoint_dir = os.path.dirname(checkpoint_path)
    # Callback that saves the model's weights
    cp = tensorflow.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                    save_weights_only=True,
                                                    verbose=1)
    # ReduceLROnPlateau callback for learning-rate scheduling
    rlrop = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=150)
    # batch_size is set on the generator; passing it to fit() again would be ignored
    history = model.fit(x=aug.flow(trainX, trainY, batch_size=256),
                        validation_data=(valX, valY), epochs=epochs,
                        callbacks=[rlrop, cp])
    model.save('/content/food/second')
    _, accuracy = model.evaluate(valX, valY)
    return accuracy
# evaluate the keras model
aug = ImageDataGenerator()
alldata = [trainX,trainY,valX,valY]
epochs = 25
# model = getmodel(h,w)
# acc = trainmodel(model, alldata,aug,epochs)
x_valid = np.array(datate)
x_valid = x_valid.reshape((len(x_valid), h,w, 1))
y_pred = model.predict(x_valid)
y_pred = np.where(y_pred < 0.5, 0, 1)
cm_own = confusion_matrix(y_valid, y_pred)
dispfully = ConfusionMatrixDisplay(confusion_matrix=cm_own,
display_labels=[0,1])
dispfully.plot()
plt.grid(False)
plt.show()
cl_reportOwn = classification_report(y_valid,y_pred)
print(cl_reportOwn)
              precision    recall  f1-score   support

         0.0       0.82      0.87      0.84      1601
         1.0       0.87      0.82      0.84      1678

    accuracy                           0.84      3279
   macro avg       0.84      0.84      0.84      3279
weighted avg       0.84      0.84      0.84      3279
bag_metadata["Own"] = {"model": model, "conv": None, "metrics":[cm_own,cl_reportOwn], "data":(x_valid,y_valid)}
bag_metadata.keys()
dict_keys(['DNN', 'vggfully', 'vggsvm', 'Own'])
model.summary()
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_12 (Conv2D) (None, 60, 80, 16) 160
conv2d_13 (Conv2D) (None, 60, 80, 16) 2320
batch_normalization_12 (Bat (None, 60, 80, 16) 64
chNormalization)
conv2d_14 (Conv2D) (None, 60, 80, 32) 4640
batch_normalization_13 (Bat (None, 60, 80, 32) 128
chNormalization)
conv2d_15 (Conv2D) (None, 60, 80, 32) 9248
batch_normalization_14 (Bat (None, 60, 80, 32) 128
chNormalization)
conv2d_16 (Conv2D) (None, 60, 80, 128) 36992
batch_normalization_15 (Bat (None, 60, 80, 128) 512
chNormalization)
conv2d_17 (Conv2D) (None, 60, 80, 128) 147584
batch_normalization_16 (Bat (None, 60, 80, 128) 512
chNormalization)
flatten_2 (Flatten) (None, 614400) 0
dense_4 (Dense) (None, 32) 19660832
batch_normalization_17 (Bat (None, 32) 128
chNormalization)
dense_5 (Dense) (None, 1) 33
=================================================================
Total params: 19,863,281
Trainable params: 19,862,545
Non-trainable params: 736
_________________________________________________________________
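As a quick sanity check on the summary above, the huge dense_4 parameter count follows directly from flattening the last 60x80x128 feature map into the 32-unit dense layer:

```python
# Dense layer parameters = flattened inputs * units + one bias per unit.
flat = 60 * 80 * 128        # feature-map H x W x channels after the last conv block
units = 32
params = flat * units + units
print(params)  # matches the 19,660,832 reported for dense_4 in the summary
```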
conv2D_layers = [layer.name for layer in reversed(model.layers) if len(layer.output_shape) == 4 and isinstance(layer, tensorflow.keras.layers.Conv2D)]
activation_layers = [layer.name for layer in reversed(model.layers) if len(layer.output_shape) == 4 and layer.__class__.__name__ == 'ReLU']
all_layers = [layer.name for layer in reversed(model.layers) if len(layer.output_shape) == 4 and (layer.__class__.__name__ == 'ReLU' or isinstance(layer, tensorflow.keras.layers.Conv2D))]
w = 80
h = 60
for i in range(3):
    img_path = "/content/drive/MyDrive/foodtask/ImagesTest/1/" + str(i) + ".jpg"
    img_size = (60,80)
    last_conv_layer_name = "conv2d_17"
    ig = cv2.imread(img_path)
    #### For the grayscale own architecture:
    gs_image = cv2.cvtColor(ig, cv2.COLOR_BGR2GRAY)
    gs_image = cv2.resize(gs_image, (80,60))
    gs_image = gs_image.reshape((1,h,w,1))
    img = asarray(gs_image)
    # RGB copy used only for the superimposed visualization
    img_array = preprocess_input(get_img_array(img_path, size=img_size))
    grad_cam = GradCam(model, img, last_conv_layer_name)
    grad_cam_superimposed = superimpose(img_array.reshape((60,80,3)), grad_cam, 0.5, emphasize=True)
    if i == 1:
        # keep this heat-map for the side-by-side comparison later on
        grad_cam_superimposed_own = grad_cam_superimposed
    prediction, classe_x, pred, _ = make_prediction(model, img_path, None, None)
    img = cv2.imread(img_path)
    plt.figure(figsize=(15, 15))
    ax = plt.subplot(1, 2, 1)
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    plt.axis('off')
    plt.title('Original Image')
    ax = plt.subplot(1, 2, 2)
    plt.imshow(grad_cam_superimposed)
    plt.axis('off')
    plt.title(f'{last_conv_layer_name} Grad-CAM heat-map')
    plt.tight_layout()
    print("prediction : ", ("Not food" if prediction < 0.5 else "Food"), prediction)
prediction : Not food [[0.3465631]] prediction : Food [[0.9901761]] prediction : Not food [[0.10886151]]
Visualize grad-cam for multiple conv layers
img_path = "/content/drive/MyDrive/foodtask/ImagesTest/1/1.jpg"
img_size = (60,80)
last_conv_layer_name = "conv2d_17"
ig = cv2.imread(img_path)
#### For grayscale own architecture:
gs_image = cv2.cvtColor(ig, cv2.COLOR_BGR2GRAY)
gs_image = cv2.resize(gs_image,(80,60))
gs_image = gs_image.reshape((1,h,w,1))
img_array_gs = asarray(gs_image)
img = img_array_gs
img_array = preprocess_input(get_img_array(img_path, size=img_size))
plt.figure(figsize=(12, 20))
for i, layer in enumerate(conv2D_layers):
    grad_cam = GradCam(model, img, layer)
    grad_cam_emphasized = superimpose(img_array.reshape((60,80,3)), grad_cam, 0.5, emphasize=True)
    ax = plt.subplot(13, 4, i + 1)
    plt.imshow(grad_cam_emphasized)
    plt.title(layer)
    plt.axis("off")
plt.tight_layout()
Interestingly, the first layer plotted, conv2d_17 (the deepest conv layer), has a more defined gradient output (a more homogeneous color gradient), whereas the subsequent, shallower layers seem to learn other patterns beyond the food shape (we can see some activation importance coming from the plate as well!).
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(15,10), sharex=True, sharey=True)
for i, key in enumerate(bag_metadata.keys()):
    cm = bag_metadata[key]["metrics"][0]
    disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=[0,1])
    ax = axes.flatten('F')[i]  # fill column by column, as in the original layout
    disp.plot(cmap='viridis', ax=ax)
    ax.set_title(key)
    ax.grid(False)
img_path = "/content/drive/MyDrive/foodtask/ImagesTest/1/1.jpg"
img_size = (60,80)
img = cv2.imread(img_path)
plt.figure(figsize=(15, 15))
ax = plt.subplot(1, 3, 1)
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.title('Original Image')
ax = plt.subplot(1, 3, 2)
plt.imshow(grad_cam_superimposed_vgg)
plt.axis('off')
plt.title(f'Block5_conv3 vgg16 archit')
ax = plt.subplot(1, 3, 3)
plt.imshow(grad_cam_superimposed_own)
plt.axis('off')
plt.title(f'Conv2d_17 Own archit')
plt.tight_layout()
print("prediction : ", ("Not food" if prediction < 0.5 else "Food"), prediction)
prediction : Not food [[0.10886151]]
Due to time and computational constraints, I'll build several new CNN models with only 20 epochs each, changing some parameters such as:
* the kernel_size
Borrowing the concept of human learning (from infancy to adulthood) when deciphering an image, and adapting it to this task, I will try different kernel-size patterns through the layers of the model, establishing 3 stages of "learning":
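The staged kernel-size idea can be sketched as a simple enumeration (a hypothetical helper for illustration, not the notebook's actual training code):

```python
from itertools import product

# Hypothetical helper: enumerate kernel-size schedules over three "learning
# stages" -- early, middle and late conv blocks -- to generate model variants.
def kernel_schedules(sizes=(3, 5, 7)):
    """Return every (early, middle, late) kernel-size combination."""
    return list(product(sizes, repeat=3))

schedules = kernel_schedules()
print(len(schedules))  # 27 candidate configurations for 3 sizes x 3 stages
```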
Below you can find the trained models and their parameters.
Let's now check some graphs of val_loss and val_accuracy for this round of trained models.
dir = "/content/drive/MyDrive/foodtask/histories/"
import plotly.graph_objects as go
y = np.arange(1,20,1)
colors = ["red", "green", "black", "gray", "yellow", "orange", "purple", "brown","blue"]
plt.figure(figsize=(15,15))
i = 0
fig = go.Figure()
for each in os.listdir(dir):
    if "Dict" in each:
        filer = dir + str(each)
        hist = pickle.load(open(filer, "rb"))
        # model number is the filename suffix (one or two digits)
        mod_n = each[-2:] if len(each) == 18 else each[-1]
        ### Uncomment the line below to check the validation metrics
        # fig.add_trace(go.Scatter(x=y, y=np.array(hist["val_accuracy"]), name=("val acc model " + mod_n), line=dict(color=colors[i])))
        fig.add_trace(go.Scatter(x=y, y=np.array(hist["accuracy"]), name=("train acc model " + mod_n), line=dict(color=colors[i], dash='dash')))
        i += 1
fig.show()
<Figure size 1080x1080 with 0 Axes>
If you cannot visualize the interactive Plotly graph, check the static one below.
For models 5 and 11:
From this first round of attempts, a final CNN version (called own2) will be trained, with more layers added and 50 epochs of training with data augmentation.
Although the choice of model doesn't follow any strict criterion, I'll give priority to data augmentation and to avoiding overfitting.
Furthermore, as the models were simple and trained for only 20 epochs, I don't expect a great performance difference between them when trained for more epochs in a more complex architecture.
Nevertheless, respecting the above criteria, models 5 and 11 are chosen and a blend of their parameters will be made.
I added 8 more layers (4 Conv2D and 4 BatchNormalization) and widened the final dense layer from 32 to 128 nodes.
Final accuracies:
tr acc : 0.89
val acc : 0.75
from tensorflow.keras.regularizers import l2
datatr = data["Train/"]
datate = data["Valid/"]
(trainX, valX, trainY, valY) = train_test_split(np.array(datatr), y_train,
test_size=0.25, random_state=42)
trainX = trainX.reshape((len(trainX), h,w, 1))
valX = valX.reshape((len(valX), h,w, 1))
print(type(trainX), type(trainY))
aug = ImageDataGenerator(rotation_range=20, zoom_range=0.15,
width_shift_range=0.2, height_shift_range=0.2, shear_range=0.15,
horizontal_flip=True, fill_mode="nearest")
aug_val = ImageDataGenerator(rotation_range=20, zoom_range=0.15,
width_shift_range=0.2, height_shift_range=0.2, shear_range=0.15,
horizontal_flip=True, fill_mode="nearest")
def getmodel2(h, w, reg, init):
    model = Sequential()
    model.add(Conv2D(16, (3,3), padding="same", activation="relu", input_shape=(h,w,1), kernel_regularizer=reg))
    model.add(Conv2D(16, (3,3), padding="same", activation="relu", kernel_regularizer=reg, kernel_initializer=init))
    model.add(BatchNormalization(axis=-1))
    model.add(Conv2D(32, (3,3), padding="same", activation="relu", kernel_regularizer=reg, kernel_initializer=init))
    model.add(BatchNormalization(axis=-1))
    model.add(Conv2D(32, (3,3), padding="same", activation="relu", kernel_regularizer=reg, kernel_initializer=init))
    model.add(BatchNormalization(axis=-1))
    # model.add(Dropout(0.25))
    model.add(Conv2D(64, (3,3), padding="same", activation="relu", kernel_regularizer=reg, kernel_initializer=init))
    model.add(BatchNormalization(axis=-1))
    model.add(Conv2D(64, (3,3), padding="same", activation="relu", kernel_regularizer=reg, kernel_initializer=init))
    model.add(BatchNormalization(axis=-1))
    model.add(Conv2D(128, (3,3), padding="same", activation="relu", kernel_regularizer=reg, kernel_initializer=init))
    model.add(BatchNormalization(axis=-1))
    model.add(Conv2D(128, (3,3), padding="same", activation="relu", kernel_regularizer=reg, kernel_initializer=init))
    model.add(BatchNormalization(axis=-1))
    # model.add(Dropout(0.25))
    model.add(Conv2D(128, (3,3), padding="same", activation="relu", kernel_regularizer=reg, kernel_initializer=init))
    model.add(BatchNormalization(axis=-1))
    model.add(Conv2D(128, (3,3), padding="same", activation="relu", kernel_regularizer=reg, kernel_initializer=init))
    model.add(BatchNormalization(axis=-1))
    model.add(Flatten())
    model.add(Dense(128))
    model.add(BatchNormalization())
    # model.add(Dropout(0.5))
    model.add(Dense(1, activation='sigmoid'))
    return model
def trainmodel2(model, alldata, aug, aug_val, epochs):
    trainX, trainY = alldata[0], alldata[1]
    valX, valY = alldata[2], alldata[3]
    opt = Adam()
    model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
    checkpoint_path = "content/food/cp.ckpt"
    checkpoint_dir = os.path.dirname(checkpoint_path)
    history = model.fit(x=aug.flow(trainX, trainY, batch_size=256),
                        validation_data=aug_val.flow(valX, valY, batch_size=256),
                        epochs=epochs)
    model.save('/content/food/cnn5')
    with open('trainHistoryDict15', 'wb') as file_pi:
        pickle.dump(history.history, file_pi)
    _, accuracy = model.evaluate(valX, valY)
    return accuracy
# evaluate the keras model
# aug = ImageDataGenerator()
alldata = [trainX,trainY,valX,valY]
epochs = 50
# reg = l2(0.0001)
reg = None
init = "glorot_uniform"
# model = getmodel2(h,w,reg,init)
# acc = trainmodel2(model, alldata,aug,aug_val,epochs)
<class 'numpy.ndarray'> <class 'numpy.ndarray'>
import plotly.graph_objects as go
y = np.arange(1,50,1)
colors = ["red", "green", "black", "gray", "yellow", "orange", "purple", "brown","blue"]
plt.figure(figsize=(15,15))
# i = 0
fig = go.Figure()
filer = "/content/drive/MyDrive/foodtask/histories/trainHistoryDict15"
hist = pickle.load(open(filer,"rb"))
mod_n = 15
fig.add_trace(go.Scatter(x=y, y=np.array(hist["val_accuracy"]), name = ("val acc model own2"), line = dict(color = colors[1], dash = 'dot')))
fig.add_trace(go.Scatter(x=y, y=np.array(hist["accuracy"]), name = ("train acc model own2"), line = dict(color = colors[0], dash = 'dash') ))
# plt.ylim([0,1])
fig.add_trace(go.Scatter(x=y, y=np.array(hist["val_loss"]), name = ("val loss model own2"), line = dict(color = colors[5], dash = 'dot')))
fig.add_trace(go.Scatter(x=y, y=np.array(hist["loss"]), name = ("train loss model own2"), line = dict(color = colors[2], dash = 'dash') ))
fig.update_yaxes(range=[0,1])
<Figure size 1080x1080 with 0 Axes>
If you cannot visualize the interactive Plotly graph, check the static one below.
!unzip "/content/drive/MyDrive/foodtask/cnn5.zip" -d /content/cnn
Archive: /content/drive/MyDrive/foodtask/cnn5.zip creating: /content/cnn/assets/ inflating: /content/cnn/keras_metadata.pb inflating: /content/cnn/saved_model.pb creating: /content/cnn/variables/ inflating: /content/cnn/variables/variables.data-00000-of-00001 inflating: /content/cnn/variables/variables.index
modelown2 = tensorflow.keras.models.load_model("/content/cnn")
modelown2.summary()
x_valid = np.array(datate)
x_valid = x_valid.reshape((len(x_valid), h,w, 1))
y_pred = modelown2.predict(x_valid)
y_pred = np.where(y_pred < 0.5, 0, 1)
cm_own2 = confusion_matrix(y_valid, y_pred)
dispfullyown2 = ConfusionMatrixDisplay(confusion_matrix=cm_own2,
display_labels=[0,1])
dispfullyown2.plot()
plt.grid(False)
plt.show()
cl_reportOwn = classification_report(y_valid,y_pred)
print(cl_reportOwn)
              precision    recall  f1-score   support

         0.0       0.76      0.96      0.85      1601
         1.0       0.95      0.71      0.81      1678

    accuracy                           0.83      3279
   macro avg       0.86      0.83      0.83      3279
weighted avg       0.86      0.83      0.83      3279
model = modelown2
for i in range(3):
    img_path = "/content/drive/MyDrive/foodtask/ImagesTest/1/" + str(i) + ".jpg"
    img_size = (60,80)
    last_conv_layer_name = "conv2d_41"
    ig = cv2.imread(img_path)
    #### For the grayscale own architecture:
    gs_image = cv2.cvtColor(ig, cv2.COLOR_BGR2GRAY)
    gs_image = cv2.resize(gs_image, (80,60))
    gs_image = gs_image.reshape((1,h,w,1))
    img = asarray(gs_image)
    # RGB copy used only for the superimposed visualization
    img_array = preprocess_input(get_img_array(img_path, size=img_size))
    grad_cam = GradCam(model, img, last_conv_layer_name)
    grad_cam_superimposed = superimpose(img_array.reshape((60,80,3)), grad_cam, 0.5, emphasize=True)
    if i == 1:
        # keep this heat-map for the side-by-side comparison below
        grad_cam_superimposed_own2 = grad_cam_superimposed
    prediction, classe_x, pred, _ = make_prediction(model, img_path, None, None)
    img = cv2.imread(img_path)
    plt.figure(figsize=(15, 15))
    ax = plt.subplot(1, 2, 1)
    plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    plt.axis('off')
    plt.title('Original Image')
    ax = plt.subplot(1, 2, 2)
    plt.imshow(grad_cam_superimposed)
    plt.axis('off')
    plt.title(f'{last_conv_layer_name} Grad-CAM heat-map')
    plt.tight_layout()
    print("prediction : ", ("Not food" if prediction < 0.5 else "Food"), prediction)
print("prediction : ", ("Not food" if prediction < 0.5 else "Food"), prediction)
prediction : Food [[0.6117783]] prediction : Food [[0.9781345]] prediction : Not food [[0.00065145]]
from gradcam import *
img_path = "/content/drive/MyDrive/foodtask/ImagesTest/1/1.jpg"
img_size = (60,80)
img = cv2.imread(img_path)
plt.figure(figsize=(15, 15))
ax = plt.subplot(1, 4, 1)
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
# plt.imshow(img.reshape((60,80,3)))
plt.axis('off')
plt.title('Original Image')
ax = plt.subplot(1, 4, 2)
plt.imshow(grad_cam_superimposed_vgg)
plt.axis('off')
plt.title(f'Block5_conv3 vgg16 archit')
ax = plt.subplot(1, 4, 3)
plt.imshow(grad_cam_superimposed_own)
plt.axis('off')
plt.title(f'Conv2d_17 Own archit')
ax = plt.subplot(1, 4, 4)
plt.imshow(grad_cam_superimposed_own2)
plt.axis('off')
plt.title(f'Conv2d_41 Own archit 2nd model')
plt.tight_layout()
What if we could drastically reduce the input variables and still have a reasonable model to predict?
While DNNs are commonly used when the task is to detect and classify images, other approaches exist and are sometimes worth checking!
For the sake of curiosity, let's try bootstrap sampling with 5 resamples, only 150 input variables (from PCA) and half the training data!
As we are going to train a Decision Tree classifier, we'll use the data_stack data, where each image is unrolled into one dimension.
data_stack["Train/"].shape
(13113, 4801)
Reducing the dimension using PCA.
Let's check how much information we conserve using 150 variables (a drastic cut that leaves out ~97% of the original dimensions).
from sklearn.decomposition import PCA
import numpy as np
pca = PCA(150)
pca.fit(data_stack["Train/"][:,:-1])
plt.plot(np.cumsum(pca.explained_variance_ratio_))
plt.xlabel('number of components')
plt.ylabel('cumulative explained variance');
With 150 components, our data conserves around 81% of the variance!
Now, we transform our original data so to take only these 150 components:
##### New and transformed DATA:
xtrain = data_stack["Train/"][:,:-1]
comp = pca.transform(xtrain)
comp.shape
(13113, 150)
from sklearn.utils import resample
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
from matplotlib import pyplot
import time
import random
n_iter = 5  #### design sampling (renamed from `iter`, which shadows the builtin)
stats = []
# use the 150 PCA components, matching the stated setup (xtrain would keep all raw pixels)
x = np.concatenate((comp, y_train), axis=1)
## the size of our training:
n_size = int(0.5*len(x))
st = time.time()
for _ in range(n_iter):
    dat = resample(x, n_samples=n_size)
    (trainx, valx, trainy, valy) = train_test_split(dat[:,:-1], dat[:,-1], stratify=dat[:,-1],
                                                    test_size=0.2, random_state=42)
    model = DecisionTreeClassifier(random_state=1234)
    model.fit(trainx, trainy)
    pred = model.predict(valx)
    score = accuracy_score(valy, pred)
    stats.append(score)
end = time.time()
print("Time consumed : ", np.round((end - st), 3))
print(f"Mean accuracy : {np.mean(stats):.3f}")
stats
Time consumed : 149.351 Mean accuracy : 0.751
[0.7416158536585366, 0.739329268292683, 0.7454268292682927, 0.7644817073170732, 0.7629573170731707]
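A natural follow-up (a sketch, not part of the original run): turn the bootstrap scores above into a percentile confidence interval, which is the usual payoff of resampling:

```python
import numpy as np

# The five bootstrap accuracies printed above.
stats = [0.7416158536585366, 0.739329268292683, 0.7454268292682927,
         0.7644817073170732, 0.7629573170731707]

alpha = 0.95
lower = np.percentile(stats, (1 - alpha) / 2 * 100)  # 2.5th percentile
upper = np.percentile(stats, (1 + alpha) / 2 * 100)  # 97.5th percentile
print(f"{alpha:.0%} CI for accuracy: [{lower:.3f}, {upper:.3f}]")
```

With only 5 resamples the interval is rough; more iterations would tighten it.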
Nevertheless, due to a TensorFlow version incompatibility, SHAP values couldn't be used here, as downgrading TensorFlow would require reworking all the model-training code.
References:
Threads reporting the issue:
https://github.com/tensorflow/probability/issues/540